image.png

Deliverables:

  • Submit two files that are labeled: YourLastName_Assignment_3 that have the following formats:

    1. Your HTML that has your Source code and output
    2. Your ipynb script that has your Source code and output

Objectives:

  • Use SQL to execute different queries to retrieve data from Chicago Crime dataset and Police statins dataset
  • Use Geospatial queries to locate police stations and gun related crimes (with arrest or no arrest) in every district on Choropleth map
  • Use Geospatial queries to provide descriptive stat for every district on Choropleth map
  • Use Geospatial queries to locate the Block that is the furthest (Maximum Distance) from the police station that has gun related crime resulted in arrest

Formatting Python Code When programming in Python, refer to Kenneth Reitz’ PEP 8: The Style Guide for Python Code: http://pep8.org/ (Links to an external site.)Links to an external site. There is the Google style guide for Python at https://google.github.io/styleguide/pyguide.html (Links to an external site.)Links to an external site. Comment often and in detail.

Descriptions and Requirement Specifications

Chicago Crimes

In his first state of the uniion address , president Trump mentioned Chicago violance 10 times Trump's State of the Union Address

Chicago has more homicides than New York and Los Angeles combined

Columnist Clarence Page wrote an article , published by the Chicago Tribune stated that the city of Chicago had more homicides in the past two years than New York and Los Angeles combined

Chicago Police Department

Chicago police department CPD issues and publishes on daily basis on its website crime alerts, and press releases for the different districts .

image.png

The CPD categoizes the crimes into 8 categories as follows:

image.png

Chicago Crimes Dataset

The CSV file for crimes dataset for the city of Chicago is obtained from the data portal for the city of Chicago. Here is the link for the city of Chicago data portal City of Chicago Data Portal

image.png

Accessing the data

Three set of data are need for this assignment:

  1. The Chicago police stations in every district (hosted on NW server)
  2. The Crimes dataset (hosted on NW server)
  3. The Boundaries.geojson data for district boundries (downloaded in Zip file)

Complete description of the dataset can be found on Chicago city data portal.

Based on Trumps State of the Uniion Address and the article written by columnist Clarence Page and published by the Chicago Tribune, we are interested to retrieve the data for the past two years and perform different types of spatial queries.

There are few of these queries that we are interested in to help CPD and city of Chicago to plot on a Choroplteh map those districts that have highest gun crimes.

Here are examples of those types of queries:

  1. Plot on Choropleth map the districts and their Violent Crimes
  2. Plot on Choropleth map the districts and their Gun related crimes
  3. Which district is the crime capital of Chicago districts?
  4. What the crime density per district?
  5. Plot on Choropleth map those gun related crimes that resulted in arrests
  6. Plot on Choropleth map the gun related crime that is in the farthest Block from the policy stattion for every district

Packages you need to Connect PostgreSQL server to load and retrieve Crhicago Crime dataset from the database:

  1. psycopg2: for PostgreSQL driver
  2. area: to calculate the area inside of any GeoJSON geometry
  3. Folium: for Choropleth maps </font>

Since we are using PostGIS in our work, please read and bookmark Chapter 4. Using PostGIS: Data Management and Queries

In [1]:
import sys
!{sys.executable} -m pip install folium
!{sys.executable} -m pip install area
!{sys.executable} -m pip install psycopg2-binary
import folium
from folium import plugins
from folium.plugins import MarkerCluster
import psycopg2
import csv
import pandas as pd
import json
from area import area

from psycopg2.extensions import ISOLATION_LEVEL_AUTOCOMMIT 
Requirement already satisfied: folium in /opt/anaconda3/lib/python3.7/site-packages (0.11.0)
Requirement already satisfied: jinja2>=2.9 in /opt/anaconda3/lib/python3.7/site-packages (from folium) (2.10.3)
Requirement already satisfied: numpy in /opt/anaconda3/lib/python3.7/site-packages (from folium) (1.17.2)
Requirement already satisfied: requests in /opt/anaconda3/lib/python3.7/site-packages (from folium) (2.22.0)
Requirement already satisfied: branca>=0.3.0 in /opt/anaconda3/lib/python3.7/site-packages (from folium) (0.4.1)
Requirement already satisfied: MarkupSafe>=0.23 in /opt/anaconda3/lib/python3.7/site-packages (from jinja2>=2.9->folium) (1.1.1)
Requirement already satisfied: certifi>=2017.4.17 in /opt/anaconda3/lib/python3.7/site-packages (from requests->folium) (2019.9.11)
Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /opt/anaconda3/lib/python3.7/site-packages (from requests->folium) (1.24.2)
Requirement already satisfied: idna<2.9,>=2.5 in /opt/anaconda3/lib/python3.7/site-packages (from requests->folium) (2.8)
Requirement already satisfied: chardet<3.1.0,>=3.0.2 in /opt/anaconda3/lib/python3.7/site-packages (from requests->folium) (3.0.4)
Requirement already satisfied: area in /opt/anaconda3/lib/python3.7/site-packages (1.1.1)
Requirement already satisfied: psycopg2-binary in /opt/anaconda3/lib/python3.7/site-packages (2.8.5)
In [2]:
#folium.__version__

IMPORTANT NOTES:

  • Use this version of the assignment if you don't want to install PostgreSQL server on your laptop/computer locally to experiment with datbase and tables creation
  • In this version of the assignment you will be connecting to DSCC PostgreSQL server that has the database and tables already created on the server.
  • You need to use your NetID and password for login and passward to connect PostgreSQL server hosted on DSCC
  • For the psycopg2.connect statements listed below, you must provide your NetID and password in order to connect to PostgreSQL server hosted on DSCC
In [37]:
# All data
db_connection = psycopg2.connect(host='129.105.208.229',dbname="chicago_crimes_ut", user="" , password="")

# ----------------------------------------------------------------
#   full data 
#db_connection = psycopg2.connect(host='129.105.208.229',dbname="chicago_crimes", user="YourNetID" , password="YourPassword")
# -----------------------------------------------------------------
cursor = db_connection.cursor()

Chicago Crimes Dataset

The Crimes_2001_to_present.csv is downloaded from Chicago data portal and it has roughly 6.5 million records.

While working on this dataset, It is prudent to make a note of the following:

  1. Geospatial queries are very demanding for system resouces like CPU, Memory, and DISK
  2. We are interested in the data set of the past 2 years, and when you execute Geospatial type queries, please be advised that these queries slow down your machine.
  3. Running this script to work on the data of the past 2 years will require roughly 25 minutes to complete. And requires roughly 40 minutes to complete using the dataset of the past 5 years. And requires hours to complete on the entire dataset with at least 16GB memory.
  4. It is a good idea to take a slice (past two years) of the dataset and store it, that will help improve perfoamnce significantly especialy for SEARCH and SORT algorithms that are utilized by the database engine.

Algorithm Performance

  • Sort algorithms used by the database engines vary in performance between O($N log N$) and O($ N^{2} $) where $N$ is the size of the number

  • Search algorithms used by the database engines vary in performance between O($log N$) and O($ N $) where $N$ is the size of the number

Lets start executing different Queries

Query #1:

  • Calculate the total number of crimes in every district and plot that on Choropleth map
In [4]:
cursor.execute("SELECT district, count(district) \
               FROM crimes \
               GROUP BY district")
rows=cursor.fetchall()
In [5]:
# put returned data into a dataframe
crimes_per_district = pd.DataFrame(rows, columns=['dist_num','number_of_crimes'])
crimes_per_district['dist_num'] = crimes_per_district['dist_num'].astype(str)

crimes_per_district.head()
Out[5]:
dist_num number_of_crimes
0 24 412
1 11 1113
2 8 886
3 19 608
4 25 726
In [6]:
# set up map
total_number_of_crimes_per_district_map = folium.Map(location =(41.8781, -87.6298),zoom_start=11)
In [7]:
# load in geo data from geojson file
# set map up with parameters
total_number_of_crimes_per_district_map.choropleth(geo_data="Boundaries.geojson", 
              fill_color='OrRd', 
              fill_opacity=0.5, 
              line_opacity=1,
              data = crimes_per_district,
              key_on='feature.properties.dist_num',
              columns = ['dist_num', 'number_of_crimes'],
              legend_name = ' Crimes per District')
/opt/anaconda3/lib/python3.7/site-packages/folium/folium.py:415: FutureWarning: The choropleth  method has been deprecated. Instead use the new Choropleth class, which has the same arguments. See the example notebook 'GeoJSON_and_choropleth' for how to do this.
  FutureWarning
In [8]:
# start with Police stations, but eliminate HQ
cursor.execute("""SELECT ST_X(ST_AsText(Where_IS)), ST_Y(ST_AsText(Where_IS)), district \
            FROM police_stations \
            WHERE district!='Headquarters'""")
police_stations = cursor.fetchall()

# loop through police stations
for police_station in police_stations:
    police_station_location = (police_station[0],police_station[1])
    cursor.execute("SELECT district, count(district) \
                    FROM crimes \
                    WHERE district= %s \
                    GROUP BY district",[police_station[2]])
    districts_crime_numbers = cursor.fetchall()
    for district in districts_crime_numbers:
        folium.Marker(location = police_station_location, 
        popup = folium.Popup(html="District No : %s  has   Total Number of Crimes:%s"
        %district ,max_width=450)).add_to(total_number_of_crimes_per_district_map)
  • Lets plot the Choropleth map and notice the intensity of color on the different districts
  • The Blue POPUP represents the location of police station in the different districts in the map
In [9]:
# show the map
total_number_of_crimes_per_district_map
Out[9]:
Make this Notebook Trusted to load map: File -> Trust Notebook

Query #2:

  • Calculate the total number of violent crimes in every district and plot that in a table on Choropleth map

Well, we really need only the violent crimes per district, so we will filter only those crimes that we are interested in. Please note that we are not interested to plot property crimes, we are really after violent crimes and in particular Gun related crimes.

So for now, lets plot violent crimes on Choropleth map and later on we will filter only Gun related crimes

In [10]:
violent_crime_categories='THEFT','ASSAULT','ROBBERY','KIDNAPPING','CRIM SEXUAL ASSAULT','BATTERY','MURDER'
In [11]:
# isolate violent crime as defined above
cursor.execute("SELECT district, count(district)\
                FROM crimes \
                WHERE PRIMARY_TYPE in %s \
                GROUP BY district",[violent_crime_categories])
rows=cursor.fetchall()
violent_crime_data=pd.DataFrame(rows, columns=['district_num','number_of_violent_crimes'])
violent_crime_data['district_num'] = violent_crime_data['district_num'].astype(str)
violent_crime_data
Out[11]:
district_num number_of_violent_crimes
0 8 488
1 11 407
2 24 216
3 19 368
4 25 362
5 4 400
6 14 309
7 3 387
8 17 220
9 20 147
10 22 217
11 9 312
12 7 367
13 10 362
14 1 624
15 5 293
16 18 605
17 2 349
18 16 252
19 15 291
20 6 457
21 12 430
In [12]:
violent_crimes_per_district_map= folium.Map(location =(41.8781, -87.6298),zoom_start=11)
violent_crimes_per_district_map.choropleth(geo_data="Boundaries.geojson", 
              fill_color='YlOrRd', 
              fill_opacity=0.5, 
              line_opacity=1,
              data = violent_crime_data,
              key_on='feature.properties.dist_num',
              columns = ['district_num', 'number_of_violent_crimes'],
              legend_name="Violent Crime Map"
              )
In [13]:
cursor.execute("""SELECT ST_X(ST_AsText(Where_IS)), ST_Y(ST_AsText(Where_IS)), district \
                FROM police_stations \
                WHERE district!='Headquarters'""")
police_stations = cursor.fetchall()

for police_station in police_stations:
    police_station_location =(police_station[0],police_station[1])
    cursor.execute("SELECT PRIMARY_TYPE, count(PRIMARY_TYPE) \
                    FROM crimes where district = %s AND PRIMARY_TYPE in %s \
                    GROUP BY PRIMARY_TYPE",[police_station[2],violent_crime_categories])
    data = cursor.fetchall()
    violent_crimes_per_district_df = pd.DataFrame(data, columns=['Description', 'Number of Violent Crimes'])
    header = violent_crimes_per_district_df.to_html(classes=
                    'table table-striped table-hover table-condensed table-responsive')
    folium.Marker(location=police_station_location, popup=folium.Popup(html="District Number %s - Violent Crimes %s"
                    %(police_station[2],header))).add_to(violent_crimes_per_district_map)
In [14]:
violent_crimes_per_district_map
Out[14]:
Make this Notebook Trusted to load map: File -> Trust Notebook

Query #3:

  • Calculate the total number of gun related crimes in every district and plot that in a table on Choropleth map

Lets first create a dataframe of gun crimes per district first to get an idea about the number of gun crimes per district

In [15]:
# isolate gun crime
gun='%GUN%'
cursor.execute("SELECT district, count(district)\
                FROM crimes\
                WHERE DESCRIPTION::text LIKE %s GROUP BY district",[gun])
districts_gun_violent_crimes = cursor.fetchall()
districts_gun_violent_crimes_df = pd.DataFrame(districts_gun_violent_crimes, columns=['dist_num','gun_crimes'])
districts_gun_violent_crimes_df['dist_num'] = districts_gun_violent_crimes_df['dist_num'].astype(str)
districts_gun_violent_crimes_df
Out[15]:
dist_num gun_crimes
0 24 9
1 8 49
2 11 69
3 19 11
4 25 43
5 4 50
6 14 12
7 3 57
8 17 7
9 20 5
10 22 24
11 10 58
12 9 39
13 7 67
14 1 6
15 5 44
16 18 14
17 2 34
18 16 12
19 15 34
20 6 63
21 12 38
In [16]:
districts_gun_violent_crimes_map = folium.Map(location =(41.8781, -87.6298),zoom_start=11)
districts_gun_violent_crimes_map.choropleth(geo_data="Boundaries.geojson", 
              fill_color='YlOrRd', 
              fill_opacity=0.5, 
              line_opacity=1,
              data = districts_gun_violent_crimes_df,
              key_on='feature.properties.dist_num',
              columns = ['dist_num', 'gun_crimes'],
              legend_name="Gun Crime"
              )

Now, lets create a dataframe of the different types of gun crimes for every district and then plot it on Choropleth map

In [17]:
cursor.execute("""SELECT ST_X(ST_AsText(Where_IS)), ST_Y(ST_AsText(Where_IS)),
    district from police_stations where district!='Headquarters'""")

gun='%GUN%'
police_stations = cursor.fetchall()

for police_station in police_stations:
    police_station_location = (police_station[0],police_station[1])
    cursor.execute("""SELECT DESCRIPTION, count(DESCRIPTION) \
                    FROM crimes \
                    WHERE district=%s and DESCRIPTION::text LIKE %s GROUP BY DESCRIPTION""",[police_station[2],gun])
    district_gun_violent_crimes=cursor.fetchall()
    district_gun_violent_crimes_df=\
        pd.DataFrame(district_gun_violent_crimes, columns=['Description', 'Number of Gun Crime'])
    header = district_gun_violent_crimes_df.to_html(classes=
        'table table-striped table-hover table-condensed table-responsive')
    folium.Marker(location=police_station_location,popup=folium.Popup(html="District No: %s GUN_Crime: %s"
        %(police_station[2],header) )).add_to(districts_gun_violent_crimes_map)
    
In [18]:
districts_gun_violent_crimes_map
Out[18]:
Make this Notebook Trusted to load map: File -> Trust Notebook

Query #4:

  • Calculate the crime density per district
In [19]:
district = []
tarea=[]

with open('Boundaries.geojson') as f:
    data = json.load(f)
    a = data['features']
    for i in range(len(a)):
        obj=a[i]['geometry']
        n= a[i]['properties']
        district.append(n['dist_num'])
        tarea.append(area(obj)/10000)
af = pd.DataFrame({'dist_num': district,'district_area_inHectares':tarea})
af['dist_num'] = af['dist_num'].astype(str)
final_data= pd.merge(af, crimes_per_district, on='dist_num', how='inner')
final_data['crime_density'] = round(final_data['number_of_crimes']/(final_data['district_area_inHectares']/100))
final_data
Out[19]:
dist_num district_area_inHectares number_of_crimes crime_density
0 17 2492.727155 386 15.0
1 20 1132.170216 254 22.0
2 19 2225.035732 608 27.0
3 25 2827.989237 726 26.0
4 14 1555.869965 508 33.0
5 7 1688.670732 745 44.0
6 3 1576.063931 695 44.0
7 4 7068.152865 790 11.0
8 6 2099.682124 844 40.0
9 22 3490.416073 442 13.0
10 5 3318.613379 635 19.0
11 24 1406.081387 412 29.0
12 16 8171.776367 473 6.0
13 8 5992.169760 886 15.0
14 18 1215.520046 869 71.0
15 12 2509.453028 700 28.0
16 11 1582.727274 1113 70.0
17 15 989.631393 530 54.0
18 10 2038.988883 779 38.0
19 1 1214.818895 854 70.0
20 9 3505.216898 585 17.0
21 2 1949.690970 602 31.0

Query #5:

  • Create Marker Clusters on Choropleth map for those gun related crimes that resulted in arrest (green icon) and those that didn't (red icon)
In [20]:
gun_crime_arrests_map = folium.Map(location =(41.8781, -87.6298),zoom_start=11)
gun_crime_arrests_map.choropleth(geo_data="Boundaries.geojson", 
              fill_color='YlOrRd', 
              fill_opacity=0.5, 
              line_opacity=1,
              data = districts_gun_violent_crimes_df, #can use data from #3
              key_on='feature.properties.dist_num',
              columns = ['dist_num', 'gun_crimes'],
              legend_name="Gun Crime"
              )
In [21]:
cursor.execute("""SELECT ST_X(ST_AsText(Where_IS)), ST_Y(ST_AsText(Where_IS)),
    district from police_stations where district!='Headquarters'""")
gun='%GUN%'

police_stations = cursor.fetchall()

marker_cluster = MarkerCluster().add_to(gun_crime_arrests_map)

for police_station in police_stations:
    police_station_location = (police_station[0],police_station[1])
    cursor.execute("""SELECT DISTINCT ON(caseno) caseno, block,DESCRIPTION, count(arrest), arrest,latitude,longitude\
                    FROM crimes\
                    WHERE district=%s and DESCRIPTION::text LIKE %s \
                    GROUP BY caseno,block, DESCRIPTION,arrest, latitude, longitude""",[police_station[2],gun])
    crimes_per_district = cursor.fetchall()
    for crime in crimes_per_district:
        if crime[4]==True:
            folium.Marker(location=(crime[5],crime[6]),popup=
                    folium.Popup(html="District No: %s <br> Description: %s <br> Block: %s"
                    %(police_station[2],crime[2],crime[1])),icon=folium.Icon(color='green',
                    icon='ok-sign'),).add_to(marker_cluster)
        else:
            folium.Marker(location=(crime[5],crime[6]),popup=folium.Popup(html=
                    "District No: %s <br> Description: %s<br> Block: %s"
                    %(police_station[2],crime[2],crime[1])),icon=folium.Icon(color='red',
                    icon='remove-sign'),).add_to(marker_cluster)

            
In [22]:
gun_crime_arrests_map
Out[22]:
Make this Notebook Trusted to load map: File -> Trust Notebook

Query #6:

  • Plot on Choropleth map the farthest Block that has a gun crime from every police station in every district
In [23]:
farthest_block_gun_crime_map = folium.Map(location =(41.8781, -87.6298),zoom_start=11)
farthest_block_gun_crime_map.choropleth(geo_data="Boundaries.geojson", 
              fill_color='YlOrRd', 
              fill_opacity=0.5, 
              line_opacity=1,
              data = districts_gun_violent_crimes_df,
              key_on='feature.properties.dist_num',
              columns = ['dist_num', 'gun_crimes'],
              legend_name="GUN CRIME"
              )
In [24]:
cursor.execute("""SELECT ST_X(ST_AsText(Where_IS)), ST_Y(ST_AsText(Where_IS)),
    district from police_stations where district!='Headquarters'""")
police_stations = cursor.fetchall()

gun='%GUN%'

for police_station in police_stations:
    cursor.execute("""SELECT DISTINCT on (A.block) A.district,A.block, A.where_is,ST_Distance(A.where_is,B.where_is)\
    FROM crimes as A, police_stations as B 
    WHERE A.district=%s and DESCRIPTION::text LIKE %s and B.district= %s
                             and ST_Distance(A.where_is,B.where_is) 
    IN 
        ( SELECT dist
        FROM 
            (SELECT ST_Distance(A.where_is,B.where_is) as dist 
            FROM crimes as A, police_stations as B 
            WHERE A.district=%s and DESCRIPTION::text LIKE %s and B.district= %s ) as f)""",
                   [police_station[2],gun,police_station[2],police_station[2],gun, police_station[2]])
    
    farthest_block_gun_crime = cursor.fetchall()
    cursor.execute("SELECT ST_X(ST_AsText(%s)), ST_Y(ST_AsText(%s))"
                   ,(farthest_block_gun_crime[0][2],farthest_block_gun_crime[0][2]))
    farthest_block_gun_crime_location = cursor.fetchall()
    folium.Marker(location=(police_station[0],police_station[1]),popup=
                  folium.Popup(html="Police Station <br> District No.:%s <br> Farthest Gun_Crime Block:%s"%\
                 (farthest_block_gun_crime[0][0],farthest_block_gun_crime[0][1]))).add_to(farthest_block_gun_crime_map)
    folium.CircleMarker(farthest_block_gun_crime_location[0],radius=5,color='#ff3187',popup=
                  folium.Popup(html="District No.:%s <br> Block:%s"\
            %(farthest_block_gun_crime[0][0],farthest_block_gun_crime[0][1]))).add_to(farthest_block_gun_crime_map) 
    
In [25]:
farthest_block_gun_crime_map
Out[25]:
Make this Notebook Trusted to load map: File -> Trust Notebook

Requirements

The HTML document your are submitting must have the source code and the output for the following requirements

Requirement #1: (20 points)

  • Locate the Block that has the higest number of gun crimes. The popup on Choropleth map shall display the Block in every district along with the total number of gun crimes for that block
In [57]:
gun='%GUN%'
#execute query to get all occurrences
cursor.execute("""SELECT district,Block, count(Block)\
                  FROM crimes\
                  WHERE DESCRIPTION::text LIKE %s \
                  GROUP BY  district, Block""", [gun])
max_gun_crimes = cursor.fetchall()
# converted to datframe
max_gun_crimes_df= pd.DataFrame(max_gun_crimes, columns=['district', 'Block','maxBlock'])
max_gun_crimes_df = max_gun_crimes_df.sort_values('maxBlock', ascending =False)
max_gun_crimes_df = max_gun_crimes_df.drop_duplicates(subset='district', keep='first', inplace=False)
max_gun_crimes_df = max_gun_crimes_df.sort_values('district', ascending = True)
max_gun_crimes_df['district'] = max_gun_crimes_df['district'].astype(str)
max_gun_crimes_df
Out[57]:
district Block maxBlock
516 1 008XX S PARK TER 1
460 2 044XX S GREENWOOD AVE 2
562 3 071XX S WABASH AVE 3
82 4 013XX E 87TH ST 2
287 5 002XX W 112TH ST 2
692 6 078XX S ASHLAND AVE 3
666 7 067XX S ASHLAND AVE 2
390 8 064XX S ROCKWELL ST 2
97 9 002XX W 43RD ST 2
568 10 015XX S KOLIN AVE 3
464 11 009XX N HAMLIN AVE 3
315 12 013XX W 18TH ST 2
473 14 026XX W DIVISION ST 1
597 15 006XX S CICERO AVE 2
399 16 061XX W BERENICE AVE 2
471 17 035XX W WILSON AVE 1
504 18 0000X W ILLINOIS ST 1
440 19 035XX N BROADWAY 1
453 20 051XX N BROADWAY 2
700 22 114XX S CARPENTER ST 2
0 24 072XX N BELL AVE 1
409 25 027XX N NORDICA AVE 2

As one can see from the table above the Block 071XX S WABASH AVE in Districts 11, 009XX N HAMLIN AVE in District 11, 015XX S KOLIN AVE in district 10 and 078XX S ASHLAND AVE in district 6 have a max count of 3 gun crimes.

In [64]:
cursor.execute("""SELECT ST_X(ST_AsText(Where_IS)), ST_Y(ST_AsText(Where_IS)),
    district from police_stations where district!='Headquarters'""")
police_stations = cursor.fetchall()
police_stations_df = pd.DataFrame(police_stations, columns=['lat', 'long', 'district'])
max_crimes_map = folium.Map(location =(41.8781, -87.6298),zoom_start=11)

max_crimes_map.choropleth(geo_data="Boundaries.geojson", 
              fill_color='YlOrRd', 
              fill_opacity=0.5, 
              line_opacity=1,
              data = max_gun_crimes_df,
              key_on='feature.properties.dist_num',
              columns = ['district', 'maxBlock'],
              legend_name="UNLAWFUL POSS OF HANDGUN"
              )
gun='%GUN%'
#did a loop to only print out one marker per district
for district in max_gun_crimes_df['district'].unique():
    maxheader = max_gun_crimes_df[max_gun_crimes_df['district'] == district].to_html(classes=
        'table table-striped table-hover table-condensed table-responsive')
    #had to convert to int64 type to compare
    locbool = police_stations_df['district'].astype('int64') == float(district)
    #added marker to location based on police lat and long, then added subset of datframe using maxheader
    folium.Marker(
        location=(police_stations_df[locbool]['lat'], police_stations_df[locbool]['long']),
        popup=folium.Popup(html="District Number %s - Max Crimes %s"
        %(int(district), maxheader))).add_to(max_crimes_map)
In [65]:
max_crimes_map
Out[65]:
Make this Notebook Trusted to load map: File -> Trust Notebook

Requirement #2: (5 points)

  • Calculate the gun crimes density in every district
In [28]:
district = []
tarea = []
#Helps Find Area
with open('Boundaries.geojson') as f:
    data = json.load(f)
    a = data['features']
    for i in range(len(a)):
        obj=a[i]['geometry']
        n= a[i]['properties']
        district.append(n['dist_num'])
        tarea.append(area(obj)/10000)
#Make Dataframe out of Dict
af = pd.DataFrame({'dist_num': district,'district_area_inHectares':tarea})
af['dist_num'] = af['dist_num'].astype(str)
#Merge Gun Data Set with Dataset just created
final_data= pd.merge(af, districts_gun_violent_crimes_df , on='dist_num', how='inner')
#Calculate density
final_data['gun_density'] = round(final_data['gun_crimes']/(final_data['district_area_inHectares']/100))
final_data
Out[28]:
dist_num district_area_inHectares gun_crimes gun_density
0 17 2492.727155 7 0.0
1 20 1132.170216 5 0.0
2 19 2225.035732 11 0.0
3 25 2827.989237 43 2.0
4 14 1555.869965 12 1.0
5 7 1688.670732 67 4.0
6 3 1576.063931 57 4.0
7 4 7068.152865 50 1.0
8 6 2099.682124 63 3.0
9 22 3490.416073 24 1.0
10 5 3318.613379 44 1.0
11 24 1406.081387 9 1.0
12 16 8171.776367 12 0.0
13 8 5992.169760 49 1.0
14 18 1215.520046 14 1.0
15 12 2509.453028 38 2.0
16 11 1582.727274 69 4.0
17 15 989.631393 34 3.0
18 10 2038.988883 58 3.0
19 1 1214.818895 6 0.0
20 9 3505.216898 39 1.0
21 2 1949.690970 34 2.0

Requirement #3: (10 points)

  • Locate the farthest UNLAWFUL POSS OF HANDGUN crime from the police station in every district. The popup on Choropleth map shall display the district number and the block
In [41]:
gun='%UNLAWFUL POSS OF HANDGUN%'
cursor.execute("""SELECT district, count(district) \
                  FROM crimes\
                  WHERE DESCRIPTION::text LIKE %s\
                  GROUP BY district""", [gun])
farthest_block = cursor.fetchall()
districts_farthest_gun_violent_crimes_df = pd.DataFrame(farthest_block, columns=['district_num', 'block_guns'])
districts_farthest_gun_violent_crimes_df['district_num'] = districts_farthest_gun_violent_crimes_df['district_num'].astype(str)
In [42]:
farthest_block_unlawful_poss_map = folium.Map(location =(41.8781, -87.6298),zoom_start=11)
farthest_block_unlawful_poss_map.choropleth(geo_data="Boundaries.geojson", 
              fill_color='YlOrRd', 
              fill_opacity=0.5, 
              line_opacity=1,
              data = districts_farthest_gun_violent_crimes_df,
              key_on='feature.properties.dist_num',
              columns = ['district_num', 'block_guns'],
              legend_name="UNLAWFUL POSS OF HANDGUN"
              )

cursor.execute("""SELECT ST_X(ST_AsText(Where_IS)), ST_Y(ST_AsText(Where_IS)),
    district from police_stations where district!='Headquarters'""")
police_stations = cursor.fetchall()


for police_station in police_stations:
    #Tried Try Block as some queries do not return data from a district
    #Changed description to what was asked 
    try:
        cursor.execute("""SELECT DISTINCT on (A.block) A.district,A.block, A.where_is,ST_Distance(A.where_is,B.where_is)\
        FROM crimes as A, police_stations as B 
        WHERE A.district=%s and DESCRIPTION = 'UNLAWFUL POSS OF HANDGUN' and B.district= %s
                             and ST_Distance(A.where_is,B.where_is) 
        IN 
            ( SELECT max(dist) 
            FROM 
                (SELECT ST_Distance(A.where_is,B.where_is) as dist 
                FROM crimes as A, police_stations as B 
                WHERE A.district=%s  and DESCRIPTION='UNLAWFUL POSS OF HANDGUN' and B.district= %s) as f)""",
                       [police_station[2], police_station[2],police_station[2], police_station[2]])
    
        farthest_block_unlawful_poss = cursor.fetchall()
        cursor.execute("SELECT ST_X(ST_AsText(%s)), ST_Y(ST_AsText(%s))"
                       ,(farthest_block_unlawful_poss[0][2],farthest_block_unlawful_poss[0][2]))
        farthest_block_unlawful_poss_location = cursor.fetchall()
    except IndexError:
        continue
    folium.Marker(location=(police_station[0],police_station[1]),popup=
                  folium.Popup(html="Police Station <br> District No.:%s <br> Farthest Possession_Crime Block:%s"%\
                 (farthest_block_unlawful_poss[0][0],farthest_block_unlawful_poss[0][1]))).add_to(farthest_block_unlawful_poss_map)
    folium.CircleMarker(farthest_block_unlawful_poss_location[0],radius=5,color='#ff3187',popup=
                  folium.Popup(html="District No.:%s <br> Block:%s"\
            %(farthest_block_unlawful_poss[0][0],farthest_block_unlawful_poss[0][1]))).add_to(farthest_block_unlawful_poss_map) 
   

   
In [43]:
farthest_block_unlawful_poss_map
Out[43]:
Make this Notebook Trusted to load map: File -> Trust Notebook

Requirement #4: (15 points)

  • Create Marker Clusters on Choropleth map for those gun related crimes that have Location Desciption as RESIDENCE in (green icon) and those that have Location Desciption as STREET in (red icon)
In [31]:
gun_crime_arrests_map = folium.Map(location =(41.8781, -87.6298),zoom_start=11)
gun_crime_arrests_map.choropleth(geo_data="Boundaries.geojson", 
              fill_color='YlOrRd', 
              fill_opacity=0.5, 
              line_opacity=1,
              data = districts_gun_violent_crimes_df, #can use data from #3
              key_on='feature.properties.dist_num',
              columns = ['dist_num', 'gun_crimes'],
              legend_name="Gun Related Crimes"
              )

cursor.execute("""SELECT ST_X(ST_AsText(Where_IS)), ST_Y(ST_AsText(Where_IS)),
    district from police_stations where district!='Headquarters'""")



police_stations = cursor.fetchall()

marker_cluster = MarkerCluster().add_to(gun_crime_arrests_map)
gun ='%GUN%' #find gun phrase

for police_station in police_stations:
    police_station_location = (police_station[0],police_station[1])
    gun ='%GUN%'
    cursor.execute("""SELECT DISTINCT ON(caseno) caseno, block,DESCRIPTION, count(arrest), arrest,latitude,longitude, LOCATION_DESCRIPTION\
                    FROM crimes\
                    WHERE district=%s and DESCRIPTION::text LIKE %s \
                    GROUP BY caseno,block, DESCRIPTION,arrest, latitude, longitude, LOCATION_DESCRIPTION""",[police_station[2],gun])
    crimes_per_district = cursor.fetchall()
    #print(crimes_per_district)
    for crime in crimes_per_district:
        #Condition changed to find residence location and index was 7 out of the tuple,
        if "RESIDENCE" in crime[7]:
            folium.Marker(location=(crime[5],crime[6]),popup=
                    folium.Popup(html="District No: %s <br> Description: %s <br> Block: %s <br> Location Desciption: %s"
                    %(police_station[2],crime[2],crime[1], crime[7])),icon=folium.Icon(color='green',
                    icon='ok-sign'),).add_to(marker_cluster)
        
        else:
            #Changed condition to find street location
            if "STREET"in crime[7]:
                folium.Marker(location=(crime[5],crime[6]),popup=folium.Popup(html=
                        "District No: %s <br> Description: %s<br> Block: %s <br> Location Desciption: %s"
                        %(police_station[2],crime[2],crime[1], crime[7])),icon=folium.Icon(color='red',
                        icon='remove-sign'),).add_to(marker_cluster)
gun_crime_arrests_map
Out[31]:
Make this Notebook Trusted to load map: File -> Trust Notebook